Multichannel digital speech synthesizer

ABSTRACT

A multichannel digital speech synthesizer comprises a pulse generator storing periodic and aperiodic excitation signals to be processed in a lattice filter according to weighting parameters, such as gain and reflection coefficients, transmitted from a computer via a control unit and a plurality of input modules assigned to respective output channels. Each input module includes a resettable counter for timing the emissions of periodic or aperiodic excitation signals, to generate a voiced or an unvoiced speech element, and for requesting a new set of parameters from the computer upon detecting the end of a validity interval for a current set of parameters; the module further comprises a pair of buffer memories alternating in reading and writing operations under the control of the counter to ensure a continuous flow of parameter sets to the filter.

FIELD OF THE INVENTION

Our present invention relates to a digital synthesizer of sound waves for electronically producing artificial speech.

BACKGROUND OF THE INVENTION

In the field of telecommunications, the synthesis of speech is of particular interest. It permits people unskilled in computer technology to receive so-called canned messages, e.g. by telephone, without the necessity of employing full-time human operators or of using costly subscriber terminals. Such messages may inform a calling subscriber of congestion at an exchange, of the cost and duration of a call, and of a changed directory number.

A digital system for synthesizing speech stores words or portions of words in coded form, a decoder being necessary to convert the digitally encoded signals into voice signals suitable for conventional transduction into sound waves. One particular system for the synthesis of speech elements stores PCM-coded waveform samples of diphones, i.e. phoneme pairs. Such a system generates a staccato-sounding speech and has the further disadvantage of requiring a large memory.

In an attempt to achieve natural-sounding synthesis, coding techniques have been developed on the basis of mathematical models simulating the production of speech by a human vocal tract. According to one model, the vocal tract is replaced by the combination of an excitation generator and a time-variable filtering system consisting of the resonant cavities of an acoustic tube having a variable cross-section. The excitation may be a sequence of periodic or pseudorandom pressure variations, depending on whether the output is to correspond to a voiced or an unvoiced sound. The filter has coefficients which represent the effects of reflection between different cavities of the tube and are continuous functions of time; the coefficient values, however, may be considered to be constant during sufficiently short time intervals, e.g. on the order of 10 msec. Furthermore, the filter can be controlled to have a variable gain corresponding to a varying sound intensity.

Thus, an element of synthesized speech may be represented by a set of parameters coding the duration of the element, the kind of excitation (whether voiced or unvoiced), filter gain, weighting coefficients and, in the case of voiced sound, the recurrence period of the excitation pulses. These parameters are obtained by analyzing human speech in accordance with the selected model. Such an analysis is described by P. M. Bertinetto, C. Miotti, S. Sandri and E. Vivalda in a paper titled "An Interactive Synthesis System for the Detection of Italian Prosodic Rules", CSELT Technical Reports, vol. V, No. 5, December 1977. Prior synthesizers operating according to this model, however, vary the coefficients at constant intervals, thereby producing a degree of unnaturalness in the synthesized speech.

OBJECTS OF THE INVENTION

The object of our present invention is to provide an improved speech synthesizer of the type referred to.

SUMMARY OF THE INVENTION

A digital speech synthesizer according to our present invention comprises signal-generating means delivering excitation pulses of varying amplitudes and polarities to a lattice filter for producing digital speech samples in response thereto. A digital-to-analog converter at the output of the filter translates the speech samples into voice signals. A computer or other programmed message source stores sets of processing parameters transmittable, in a predetermined sequence, to the signal-generating means for commanding the emission of the excitation pulses, and to the filter for controlling the processing of these pulses thereby; the processing parameters represent coded information relating to frequency distribution, volume and duration of speech elements such as diphones. An input unit, which may be one of several identical modules, operatively connects the signal-generating means and the filter to the message source for producing consecutive speech elements of a voice signal coded by the parameter-set sequence. The input unit includes counting means for controlling the respective duration of each speech element according to counter settings transmitted by the message source together with the processing parameters, these settings establishing different counts of validity intervals for the respective parameter sets. A time base correlates the operation of the filter, the input unit and the signal generator.

According to another feature of our present invention, the signal-generating means includes a first generator adapted to emit periodic excitation pulses, i.e. digitized amplitude samples of alternating waveforms to produce voiced elements, and a second generator adapted to emit aperiodic excitation signals, i.e. constant-amplitude pulses free from recognizable periodicity, to produce unvoiced elements of synthesized speech. The parameters from the message source include a discriminating signal for the selective enablement of one or the other generator, which may be a read-only memory, according to the nature of the sound to be generated.

Preferably, the synthesizer according to our present invention includes a plurality of input units of the aforedescribed type each associated with a respective output channel, the time base being connected to the input units for individually activating them one at a time. In such a case, the excitation-pulse generators and the filter are controlled by the time base to operate in a time-division mode for establishing time slots respectively allocated to the several input units.

According to another feature of our present invention, the counting means of each input unit include two distinct counters, namely a validity-interval counter and a sound-interval counter. The latter is preloaded with a setting or preliminary count to be progressively decremented for measuring the length of an operating period for either the periodic-signal or the aperiodic-signal generator, depending on the nature (voiced or unvoiced) of the sound. A control unit advantageously forms an interface between the message source and the input units for temporarily storing parameter-set requests therefrom and for distributing parameter sets from that source to respective input units selected according to programmed address information. Each input unit may further include a pair of buffer memories for temporarily and alternately storing successive parameter sets from the message source, the validity-interval counter being connected to these buffer memories for enabling an interchange of reading and writing functions therebetween upon detecting the termination of a current validity interval and for receiving upon such interchange, from whichever of these memories is enabled for reading, a counter setting determining the duration of the next validity interval.

According to yet another feature of our present invention, a switch operating in response to the aforementioned discriminating signal from the buffer memory enabled for reading controls the preloading of the sound-interval counter with unvoiced-interval settings equal to the encoded contents of the validity-interval counter or with pitch-period settings (i.e. a count of the cycle length of the fundamental sound frequency) from the enabled memory, these settings representing coded frequency characteristics of speech elements. An additional memory temporarily stores weighting coefficients and sound-intensity data transmitted from the read-enabled buffer memory in response to a reading signal generated by the sound-interval counter upon detecting the termination of a current sound interval; the additional memory is connected to the time base and to the filter for transmitting the weighting coefficients thereto in response to clock signals from the time base.

Pursuant to further features of our present invention, the control unit includes a logic network for enabling the transfer of a parameter request from an input unit to the message source only upon receiving therefrom consent signals indicating completion of an ongoing transmission of a parameter-set sequence to such input unit. A register temporarily stores the arriving parameters while a series-to-parallel converter decodes address signals from the message source to enable the transmission of the parameters from the register to a selected input unit. A parallel-to-series converter encodes the addresses of request-emitting input units, these addresses being temporarily stored in a read/write memory prior to their emission to the message source in response to a consent signal therefrom.

The lattice filter used in our improved speech processor may comprise a digital multiplier, a digital adder and a data store together generating a digital speech sample as a sum of terms including an excitation sample weighted by a sound-intensity coefficient and at least one term formed as a product of a reflection coefficient and a preceding digital speech sample. For the theoretical principles underlying the operation of such a filter, reference may be made to an article titled "Digital Lattice and Ladder Filter Synthesis" by A. H. Gray and John D. Markel, IEEE Transactions on Audio and Electroacoustics, Vol. AU-21, No. 6, December 1973, pages 491-500.

BRIEF DESCRIPTION OF THE DRAWING

The above and other features of our present invention will now be described in detail, reference being made to the accompanying drawing in which:

FIG. 1 is a block diagram of a multichannel digital speech synthesizer according to our present invention, including a lattice filter operatively connected to a processor via a control interface and n input modules;

FIG. 2 is a block diagram of the control unit or interface illustrated in FIG. 1;

FIG. 3 is a block diagram of an input module shown in FIG. 1;

FIG. 4 is a hypothetical diagram illustrating the principle of operation of the filter of FIG. 1;

FIG. 5 is a block diagram showing the structure of the filter of FIG. 1;

FIG. 6 is a graph of binary signals for controlling and synchronizing the operations of the synthesizer of FIG. 1; and

FIG. 7 is a graph of durations of parallel operating states of an input module shown in FIGS. 1 and 3.

SPECIFIC DESCRIPTION

FIG. 1 shows a multichannel digital speech synthesizer SIN connected to an external message source UE such as a computer or programmer for receiving therefrom sets of parameters coding information related to frequency distributions, intensity levels and durations of consecutive speech elements. The synthesizer comprises, according to our present invention, a lattice filter TV processing excitation pulses to produce digital speech samples transmitted over a lead 41 to a digital-to-analog converter MU for translation into voice signals and distribution over n outgoing signal paths in the form of transmission lines u_(a) . . . u_(n). Converter MU is an output unit advantageously consisting of n D/A stages and a series-to-parallel decoder (not shown) distributing thereto time-division-multiplexed signals arriving from filter TV.

Filter TV receives excitation pulses via an input lead 40 extending from a signal generator GE which includes a pair of read-only memories EP and EC functioning respectively as a periodic-signal emitter and aperiodic-signal emitter designed to supply filter TV with pulse trains processed thereby into digital speech samples convertible by unit MU into voiced and unvoiced elements of synthesized speech. Binary-coded signals arriving from an input module IN_(a), IN_(b), . . . IN_(n) via respective lead groups 8a, 8b, . . . 8n, merging in a common multiple 8, represent a pitch-period parameter T characterizing the fundamental frequency of a voiced speech element. In response to these signals, read-only memory EP emits a train of T pulses including a first pulse having a positive polarity and a magnitude √(T-1) and (T-1) pulses having a negative polarity and a magnitude 1/√(T-1). Thus, the train of T pulses generated by memory EP, e.g. at a cadence of 8 kHz, forms an excitation signal having a zero mean value and unitary power whereby variations in the d-c voltage level between successive sound elements are eliminated and the sound intensity or volume becomes precisely controllable according to a gain coefficient G (see FIG. 4) transmitted from computer UE to filter TV via input modules IN_(a), IN_(b), . . . IN_(n), as described more fully hereinafter with reference to FIGS. 4 and 5.
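The zero mean value and unitary power of the periodic pulse train can be checked numerically. The following is a minimal sketch rather than a model of memory EP itself; the function name `periodic_excitation` is hypothetical, and the magnitudes are read as √(T-1) and 1/√(T-1), which is the reading under which both stated properties hold.

```python
import math


def periodic_excitation(T):
    """Return one pitch period of T pulses: one positive pulse of
    magnitude sqrt(T-1) followed by (T-1) negative pulses of
    magnitude 1/sqrt(T-1). Requires T >= 2."""
    if T < 2:
        raise ValueError("pitch period T must be at least 2")
    a = math.sqrt(T - 1)
    return [a] + [-1.0 / a] * (T - 1)


def mean_and_power(pulses):
    """Mean value and mean-square power of a pulse train."""
    n = len(pulses)
    mean = sum(pulses) / n
    power = sum(p * p for p in pulses) / n
    return mean, power
```

For any T, the sum is √(T-1) - (T-1)/√(T-1) = 0 and the mean square is [(T-1) + 1]/T = 1, so the gain coefficient G alone sets the output level.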

Read-only memory EC generates trains of pulses of unitary magnitude and pseudo-random polarity. Each train constitutes an excitation signal of unitary power and substantially zero mean value. The periodicity of the pulse sequence will be practically imperceptible if that sequence is of sufficiently great length, e.g. of the order of 2¹⁰ pulses.
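The text does not say how the contents of memory EC are derived. As one illustration only, a sequence with the stated properties can be produced by a 10-bit linear-feedback shift register; the generator choice, the feedback polynomial x¹⁰ + x³ + 1 and the name `pseudo_random_excitation` are assumptions of this sketch and are not taken from the specification.

```python
def pseudo_random_excitation(length=1023, seed=1):
    """Return `length` pulses of magnitude 1 whose polarity follows a
    10-bit Fibonacci LFSR (assumed generator, period 2**10 - 1)."""
    state = seed & 0x3FF
    if state == 0:
        raise ValueError("seed must be non-zero in its low 10 bits")
    pulses = []
    for _ in range(length):
        # Polarity is taken from the least significant bit of the register.
        pulses.append(1 if state & 1 else -1)
        # New top bit is the XOR of bits 0 and 3 (recurrence a[n+10] = a[n] ^ a[n+3]).
        feedback = (state ^ (state >> 3)) & 1
        state = (state >> 1) | (feedback << 9)
    return pulses
```

Over one full period of 1023 pulses such a sequence contains 512 positive and 511 negative pulses, so its mean value is 1/1023, i.e. substantially zero, while its power is exactly 1.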

Memories EP and EC are selectively connectable to filter TV by an electronic switch S₁ under the control of a signal transmitted from an input module IN_(a) -IN_(n) over a wired-OR connection comprising leads 7a, 7b, . . . 7n and a common conductor 7. Modules IN_(a) -IN_(n) also transmit to filter TV, over respective leads 9a, 9b, . . . 9n and a common conductor 9, the coded values of multiplicative reflection coefficients K₁, K₂ etc. (FIG. 4) and of the gain coefficient G which are used by filter TV in processing the excitation signals from generator GE. The number of reflection coefficients K₁, K₂ etc. depends on the number of functional cells in filter TV, i.e. on the number of recursive digital algebraic operations performed by the filter for each speech sample emitted to converter MU, as described in detail hereinafter with reference to FIGS. 4 and 5. Associated with each excitation pulse transmitted over lead 40 to filter TV is a respective set of weighting coefficients G, K₁, K₂ etc. These coefficients, together with a discriminating bit carried by conductor 7, the signals coding the pitch period T (on multiple 8) and bits determining the duration of an interval D of validity for coefficients G, K₁, K₂ etc., constitute a set of processing parameters transmitted from computer UE to an input module IN_(a), IN_(b), . . . IN_(n) via a multiple 1 and a control unit UC which forms an interface between these input modules and the computer.

Unit UC receives, via a multiple 2 extending from computer UE, timing pulses inducing the loading of parameter signals carried by multiple 1, the latter multiple also transmitting control signals which are decoded by unit UC and serve at least in part for commanding the emission, over leads 5a, 5b, . . . 5n, of activating pulses enabling the selective loading of input modules IN_(a), IN_(b), . . . IN_(n) with parametric signals received from unit UC via a line 4. These modules, as described hereinafter with respect to FIGS. 2 and 3, emit parameter-request signals to processor UE via respective output leads 6a, 6b, . . . 6n, control unit UC and a multiple 3. On a lead 30, extending to control unit UC, computer UE transmits a verification code confirming the reception of a parameter request.

The operations of synthesizer SIN are correlated by a time base TB emitting selection signals CK_(a), CK_(b), . . . CK_(n) to input modules IN_(a), IN_(b), . . . IN_(n), respectively, reading signals CK₁ and TR₁ to memories EP, EC, and clock pulses CK_(x) (x=1, 2 . . . 5) as well as enabling signals TR_(y) (y=2, 3 . . . 6) to filter TV.

As shown in FIG. 2, control unit UC comprises a first register RE₁ loading, in response to timing pulses carried by a lead 20, parametric signals transmitted on a lead 10. A second register RE₂ temporarily stores control words arriving on a lead 11, this register being enabled by timing pulses carried on a lead 21. Leads 10, 11 and 20, 21 form part of multiples 1 and 2, respectively. Register RE₁ has an output connected to line 4, while register RE₂ has a pair of output leads 12, 13 extending to n logic circuits L_(la) -L_(ln) associated with respective input modules IN_(a) -IN_(n) and with respective output channels u_(a) -u_(n). Register RE₂ has a further output lead 14 extending to a decoder DE which in turn has output connections 5a-5n working into logic circuits L_(la) -L_(ln) and into input modules IN_(a) -IN_(n), as heretofore described. Circuits L_(la) -L_(ln) are connected via associated leads 15a-15n to respective AND gates P_(a) -P_(n) whose output leads 16a-16n are linked to a read/write memory ME₁ via an encoder COD. This memory has a read-command input from a counter CN fed by the timing pulses on lead 20 and an output tied to computer UE via a lead 31 forming part of multiple 3 (FIG. 1). A logic network LN₁ is connected to memory ME₁ for informing computer UE, via a lead 32 of multiple 3, that memory ME₁ contains at least one message.

Upon the transmission over lead 10 of the first in a sequence of parameter sets chosen by computer UE for synthesizing a predetermined voice signal to be emitted over a selected output channel u_(a) -u_(n), pulses on lead 20 enable the loading of the parameters by register RE₁. A control word simultaneously carried on lead 11 is loaded into register RE₂ in response to timing pulses on lead 21. This control word includes a bit commanding the initiation of a parameter-set sequence and inducing the energization of lead 12. A signal emitted over lead 14 causes decoder DE to energize a lead 5a-5n corresponding to the selected output channel, e.g. channel u_(a). Owing to the presence of high-level logic signals on leads 12 and 5a, circuit L_(la) emits a high-level voltage on lead 15a, thereby enabling gate P_(a) to emit a pulse to encoder COD in response to a pulse transmitted from input module IN_(a) over lead 6a. Module IN_(a) will energize lead 6a, as described in detail hereinafter with reference to FIG. 3, upon detecting the termination of a validity interval D for a set of parameters already received by module IN_(a) from computer UE. Upon receiving from gate P_(a) a pulse signifying a parameter request from module IN_(a), encoder COD writes in memory ME₁ an address code corresponding to channel u_(a). The reception and storage of the address code is detected by logic network LN₁ and communicated thereby to computer UE via lead 32. Upon the counting of a predetermined number of timing pulses indicating the completed transmission of an entire parameter set via register RE₁, counter CN generates a consent signal enabling the reading of an address code from memory ME₁. This memory is provided with n storage locations, i.e. one for every channel u_(a) -u_(n).

As shown in FIG. 3, a generic input module IN_(i) representative of all modules IN_(a) -IN_(n) includes a pair of read/write memories ME₂, ME₃ serving as buffer stores for parameter sets arriving over line 4. Lead 6i, which carries a parameter request from a validity-interval counter CD, works into memories ME₂, ME₃ for effecting an interchange of writing and reading functions therebetween, so that these memories alternate in the reception and readout of parameter sets. The energization of lead 6i also causes the emission to counter CD, via a lead 91 and from the memory ME₂ or ME₃ enabled for reading, of a counter setting determining the validity interval D of the parameter set stored by this memory. Memories ME₂, ME₃ have a common output connection 90 extending to an additional memory ME₄ for transferring parameter sets thereto; this transfer to memory ME₄ from the buffer memory ME₂ or ME₃ enabled for reading is caused by a sound-interval counter CT via a lead 60. The emission of a parameter set from memory ME₄ to filter TV via lead 9i occurs in response to clock signal CK_(i).

Counter CT is connected at a loading input to an electronic switch S₂ for receiving a sound-interval count from counter CD via a lead 61 or from read-enabled memory ME₂ or ME₃ via multiple 8i. According to whether the energization level of lead 7i indicates that the sound nature of a forthcoming speech sample is to be unvoiced or voiced, switch S₂ presets counter CT with an unvoiced-interval count equal to the current contents of component CD or with a voiced-interval count determined by the pitch-period signals carried by multiple 8i. The contents of counters CD, CT are decremented by stepping pulses SP emitted by time base TB.
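The interplay of the two buffer memories, the validity-interval counter CD, the sound-interval counter CT and switch S₂ can be summarized in a small behavioral model. This is a sketch under simplifying assumptions, not a description of the actual circuitry: the class and field names are hypothetical, counts are plain integers decremented once per stepping pulse SP, and the start-up interval, memory ME₄ and the filter coefficients are omitted.

```python
from dataclasses import dataclass


@dataclass
class ParameterSet:
    validity: int      # setting D, in stepping pulses SP
    voiced: bool       # discriminating signal (lead 7i)
    pitch_period: int  # setting T (multiple 8i); unused when unvoiced


class InputModule:
    """Behavioral sketch of counters CD, CT and buffers ME2/ME3."""

    def __init__(self, first, second):
        self.buffers = [first, second]  # stand-ins for ME2 and ME3
        self.read_idx = 0               # buffer currently enabled for reading
        self.cd = first.validity        # validity-interval counter CD
        self.ct = self._preset_ct()     # sound-interval counter CT

    def _preset_ct(self):
        # Switch S2: pitch period if voiced, else current contents of CD.
        current = self.buffers[self.read_idx]
        return current.pitch_period if current.voiced else self.cd

    def write(self, params):
        # New sets always go into the buffer NOT enabled for reading.
        self.buffers[self.read_idx ^ 1] = params

    def step(self):
        """Apply one stepping pulse; return True if a parameter
        request (lead 6i) is raised on this pulse."""
        self.cd -= 1
        self.ct -= 1
        request = self.cd == 0
        if request:
            self.read_idx ^= 1  # interchange reading and writing roles
            self.cd = self.buffers[self.read_idx].validity
        if self.ct == 0:
            self.ct = self._preset_ct()  # pulse on lead 60
        return request
```

In this model a voiced set keeps reloading CT with its pitch period until CT next expires after the buffers have been interchanged, whereas an unvoiced set loads CT with the remaining contents of CD so that both counters expire together.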

Upon the loading of a control word into register RE₂ (FIG. 2) and the transmission to decoder DE of an address code indicating the output channel associated with module IN_(i), lead 5i is energized to apply a writing command to buffer memories ME₂, ME₃ (FIG. 3). Let us assume that this control word corresponds to a first parameter set in a sequence. Counters CD and CT are then set to measure a predetermined time interval t₀ -t₁, indicated in FIG. 7, sufficient for the loading of the first parameter set into the memory ME₂ or ME₃, whichever happens to be enabled for writing; the counters CD, CT are preloaded with a common setting T₀ =D₀ at instant t₀. Upon counting out the predetermined starting interval t₀ -t₁, counter CD emits on lead 6i a pulse passed by the associated gate (P_(a) -P_(n), FIG. 2) and converted by encoder COD into a parameter request transmitted to computer UE via lead 31, as heretofore described. The pulse on lead 6i also interchanges reading and writing functions between memories ME₂, ME₃ and, if memory ME₂ is assumed to accept the first parameter set, reads onto lead 91 a code group or byte from this memory to preload the counter CD with a validity-interval setting D₁ assigned to this parameter set.

At the same instant t₁ when counter CD emits a pulse on lead 6i, counter CT temporarily energizes lead 60, thereby reading from memory ME₂ onto leads 90, 7i and 8i respective code groups which represent a set of filter coefficients G(1), K₁ (1), K₂ (1) etc. controlling the processing in filter TV of a first excitation-pulse train, a discriminating signal indicating that the sound nature of a first speech element is voiced, and signals giving a pitch period T₁ for the fundamental frequency of this first speech element. The signal carried by lead 7i induces switch S₂ to preload counter CT with a setting corresponding to pitch period T₁, this counter immediately beginning to decrement the count T₁ to measure a time interval t₁ -t₁ '. During this interval the memory ME₄ is recurrently addressed by clock signal CK_(i), at a rate inversely proportional to the number n of synthesizer channels u_(a) -u_(n), to feed coefficients G(1), K₁ (1), K₂ (1) etc. to filter TV for determining the processing of excitation pulses transmitted from read-only memory EP according to the pitch period T₁.

If there are eight output channels (n=8) and if the synthesizer SIN has a cycle length of 125 μsec, filter TV will have available an interval of almost 16 μsec per cycle for processing, according to weighting coefficients supplied by memory ME₄, an excitation pulse emitted by memory EP (FIG. 1) in response to the pitch-period code carried by leads 8a, 8. As heretofore described, memory EP is addressed by this pitch-period code and by an enabling signal TR₁ to emit an excitation signal consisting of T₁ pulses. Generally, the voiced-sound interval counted by component CT, as determined by its presetting with the corresponding pitch-period count T, is substantially greater than the interval required for the emission of a complete excitation code by memory EP, whereby 10 to 100 identical excitation codes are processed by filter TV prior to the reading of another parameter set from buffer memories ME₂, ME₃.
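The timing figures quoted above follow from simple arithmetic, worked through in the sketch below. The function names are hypothetical, and the pitch-period value used in the usage note is an illustrative assumption rather than a figure from the specification.

```python
def slot_timing(n_channels=8, cycle_us=125.0):
    """Return (sample rate in Hz, time slot per channel in microseconds)
    for a synthesizer cycle shared equally among the channels."""
    sample_rate_hz = 1e6 / cycle_us
    slot_us = cycle_us / n_channels
    return sample_rate_hz, slot_us


def pitch_frequency_hz(pitch_period, cycle_us=125.0):
    """Fundamental frequency of a voiced element whose excitation
    train repeats every `pitch_period` pulses (one pulse per cycle)."""
    return 1e6 / (cycle_us * pitch_period)
```

With n=8 and a 125 μsec cycle this gives a sample rate of 8 kHz and a slot of 15.625 μsec per channel, i.e. the "almost 16 μsec" stated above; a pitch-period count of, say, 80 pulses would then correspond to a fundamental of 100 Hz.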

Upon reaching its preset count of T₁, component CT transmits a pulse via lead 60 to memories ME₂ -ME₄. Because component CD has not yet finished counting, memories ME₂ and ME₃ are still enabled for reading and writing, respectively. Thus, the pulse on lead 60 again delivers the setting T₁ to counter CT and coefficients G(1), K₁ (1), K₂ (1) etc. to memory ME₄ whereupon the operations implemented during interval t₁ -t₁ ' are repeated in a subsequent interval t₁ '-t₁ " of identical duration.

At an instant t₂ determined by validity-interval setting D₁, counter CD energizes lead 6i to communicate a parameter-set request to computer UE and to interchange reading and writing operations between memories ME₂ and ME₃. A signal carried by lead 91 from memory ME₃ in response to the energization of lead 6i now preloads counter CD with a setting D₂ determining the next interval of validity for the parameters stored in memory ME₃. These parameters are read from memory ME₃ by counter CT at instant t₁ " and include a discriminating signal, emitted on lead 7i, indicating the sound of the next synthesized speech element to be unvoiced. This signal reverses switch S₂ to load counter CT with the current contents of counter CD and connects lead 40 (FIG. 1) to read-only memory EC. It is to be noted that, in the illustrative example of input-unit operation shown in FIG. 7, interval t₁ "-t₃ is represented with dashed lines to indicate the emission of unvoiced samples by filter TV; time t₂ -t₃ is similarly represented to indicate a validity interval for unvoiced-sound parameters. During interval t₂ -t₃, memory EC emits at least one excitation signal consisting of pulses of unitary magnitude and quasi-random polarity to be processed by filter TV according to a gain coefficient G(2) and reflection coefficients K₁ (2), K₂ (2) etc. which are fed to memory ME₄ upon the energization of lead 60 at instant t₁ " and are subsequently transmitted to filter TV under the control of clock pulses CK_(i). During interval t₂ -t₃, determined by the count D₂, memory ME₂ receives a new parameter set from computer UE via control unit UC.

Because counter CT is loaded at instant t₁ " with the contents of counter CD, these two components energize their respective output leads 60, 6i substantially simultaneously. Consequently, at instant t₃ the counter CD is preloaded to measure a time t₃ -t₄ according to a validity-interval setting D₃ transmitted from buffer ME₂ and counter CT is given a setting T₃ determining an interval t₃ -t₃ ', while memory ME₄ is fed signals from buffer ME₂ representing a third set of filter coefficients G(3), K₁ (3), K₂ (3) etc. Signals generated on lead 8i represent pitch characteristics of a speech element to be synthesized during interval t₃ -t₃ ', as well as the setting supplied to counter CT, and induce read-only memory EP to emit excitation signals constituted by a positive pulse of magnitude √(T₃ -1) and (T₃ -1) negative pulses of magnitude 1/√(T₃ -1), as heretofore described with reference to FIG. 1. One excitation pulse is emitted during each synthesizer cycle, i.e. each 125 μsec, to be processed into a digital speech sample by filter TV in response to weighting coefficients G(3), K₁ (3), K₂ (3) etc. read from memory ME₄ by clock pulses CK_(i).

At instant t₃ ', owing to validity interval t₃ -t₄ being longer than voiced-sound interval t₃ -t₃ ', counter CT again is preloaded with count T₃ and memory ME₄ receives weighting coefficients G(3), K₁ (3), K₂ (3) etc., whereby digital speech samples generated at the output of filter TV during interval t₃ -t₃ ' are represented during a succeeding interval t₃ '-t₃ ". At instant t₄, counter CD enables buffers ME₂, ME₃ for writing and for reading, respectively, and receives a setting D₄ which determines the duration of a validity interval t₄ -t₅. During the latter interval a new parameter set is written into buffer ME₂ ; as indicated in FIG. 7, however, this set is replaced at instant t₅ by yet another set which controls the sound characteristics of a speech element produced by synthesizer SIN on the associated output channel during a subsequent interval t₃ "-t₆. Owing to the brief duration of validity interval t₄ -t₅, the suppression of the corresponding sound is largely unnoticeable.

The processing of excitation pulses by filter TV is diagrammatically illustrated in FIG. 4. To produce a digital speech sample E₁₀ on the lead 41 extending to converter MU (FIG. 1), filter TV forms a product E₀, at a multiplication stage MT, of an incoming excitation pulse and a gain factor G arriving via lead 9 from one of the input units IN_(a), IN_(b), . . . IN_(n). Product E₀ is then successively diminished at differential stages SM₁ of ten functional cells TV₁ to TV₁₀ of filter TV. Stage SM₁ of each of these cells yields a resulting value E₁ to E₁₀ formed by subtracting from the result of the operation of the preceding cell MT, TV₁ etc. a product π_(1a) to π_(10a) in turn formed, at a respective multiplication stage ML₁, from a reflection coefficient K₁ to K₁₀ and a sum F₁ to F₁₀, these sums F₁ to F₁₀ being generated by feedback during the production of a preceding digital speech sample and temporarily stored at delay stages Z. Each cell TV₂ to TV₁₀ has an adder stage SM₂ at which the sums F₁ to F₉ are derived as algebraic combinations of the sums at the outputs of delays Z and products π_(2b) to π_(10b) formed at respective multiplication stages ML₂ of cells TV₂ to TV₁₀ from filter coefficients K₂ to K₁₀ and from the results E₂ to E₁₀ of subtractor stages SM₁. Thus, filter TV implements the following equations in processing an excitation pulse E₀ (τ) at a time τ to yield a digital speech sample E₁₀ (τ): ##EQU1## where

    F.sub.j (τ)=E.sub.j (τ)·K.sub.2 (τ)+F.sub.j+l (τ-Δτ)                                      (2)

and Δτ represents the duration of a processing cycle of synthesizer SIN, e.g. 125 μsec. The values of the gain G and the multiplicative reflection coefficients K₁, K₂, . . . K₁₀, which are stored in computer UE and transmitted to filter TV via an input module IN_(a), IN_(b), . . . IN_(n) as discussed above, are determined according to an acoustic-speech-production model as described in various publications listed in the aforementioned article by Bertinetto et al, including Speech Synthesis by J. L. Flanagan and L. R. Rabiner (Dowden, Hutchinson and Ross, Stroudsburg, PA., 1973) and On Some Factors Influencing the Quality of Synthesized Speech by C. Scagliola and E. Vivalda (First Colloque F.A.S.E., Paris, 1975).
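The cell-by-cell operation described with reference to FIG. 4 can be sketched in software as follows. This is one reading of the prose description, stated as an assumption: each difference is taken as E_j = E_(j-1) - K_j·F_j (with F_j delayed by one cycle), each sum as F_(j-1) = K_j·E_j + F_j (delayed), and the output E₁₀ is fed back as F₁₀. The function name `lattice_step` and the list-based state are hypothetical, and fixed-point effects of the actual hardware are ignored.

```python
def lattice_step(x, gain, k, state):
    """Process one excitation pulse `x` through an all-pole lattice.

    k     -- reflection coefficients [K1, ..., KM]
    state -- delayed sums [F1, ..., FM] from the previous sample
    Returns (speech sample E_M, new state).
    """
    m = len(k)
    if len(state) != m:
        raise ValueError("state and coefficient lists must match in length")
    e = [0.0] * (m + 1)
    e[0] = gain * x                      # product E0 at stage MT
    for j in range(1, m + 1):            # subtractor stages SM1
        e[j] = e[j - 1] - k[j - 1] * state[j - 1]
    new_state = list(state)
    for j in range(2, m + 1):            # adder stages SM2 of cells 2..M
        new_state[j - 2] = k[j - 1] * e[j] + state[j - 1]
    new_state[m - 1] = e[m]              # output fed back as F_M
    return e[m], new_state
```

With a single cell this reduces to the first-order recursion y(τ) = G·x(τ) - K₁·y(τ-Δτ), and with all reflection coefficients at zero the output is simply the excitation pulse scaled by G.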

An actual filter TV for executing the operation diagrammed in FIG. 4 is shown in FIG. 5. Lead 40 (see FIG. 1) extends to a register RE₃ via an analog-to-digital converter ADC which changes an incoming excitation pulse into a form suitable for the circuitry of filter TV; if the pulses emitted by memory EP (FIG. 1) are already coded in binary fashion, converter ADC may be omitted. Another register RE₄ has an input connected to lead 9 for receiving values of gain G and coefficients K₁, K₂ etc. from input modules IN_(a) to IN_(n). Both registers RE₃, RE₄ feed a multiplier ML₃ working into an output register RE₆. This register loads an adder SM₃ via a logic network LN₂ for selectively changing the algebraic sign, in response to the logic level of a changeover signal A/S from time base TB, of products emitted by multiplier ML₃. Register RE₆ has an output lead 42 extending to another register RE₅ and to a read/write memory ME₅ wherein reading and writing operations are controlled by a time-base signal R/W, register RE₅ and memory ME₅ working via a common output lead 41' into adder SM₃ and register RE₃. Adder SM₃ feeds yet another register RE₇ which shares output lead 42 with register RE₆.

Registers RE₃, RE₄ and RE₆ receive clock pulses CK₁, CK₂ and TR₄ for timing the operations of multiplier ML₃ to execute the products E₀, π_(1a) to π_(10a), π_(1b) to π_(10b) of stages MT, ML₁, ML₂ (see FIG. 4), while registers RE₆, RE₇ and logic network LN₂ respond to signals CK₂, CK₄, TR₄, TR₅ and A/S to control the adder SM₃ for producing the differences E₁ to E₁₀ and the sums F₁ to F₉ resulting from the operations performed at filter stages SM₁ and SM₂, respectively. Clock pulses CK₁, CK₂, CK₃ and CK₄ command the loading of registers RE₃ /RE₄, RE₆, RE₅ and RE₇, respectively, while signals TR₂, TR₃, TR₄ and TR₅ are respectively applied to tristate circuits in register RE₅, memory ME₅, register RE₆ and register RE₇ for enabling the emission of the respective contents thereof onto leads 41' and 42. A further memory ME₆ has an input tied to lead 41, extending from register RE₅ to converter MU (FIG. 1), and an output connected via lead 42 to memory ME₅ for feeding back a result E₁₀ to serve as a sum F₁₀ in a subsequent processing of an excitation pulse.

Generally, memory ME₅ stores the sums F₁ to F₁₀, thereby carrying out the function of delays Z (FIG. 4). Register RE₅ temporarily memorizes the differences E₀ to E₁₀ during the processing of an excitation pulse. It is to be noted that filter TV performs the additive, subtractive and multiplicative operations, indicated in FIG. 4, for each speech sample emitted over any output channel u_(a)-u_(n). These operations are executed in a time-division mode under the control of time base TB and will now be described in detail with reference to FIGS. 4, 5 and 6. In FIG. 6, a high level of read/write signal R/W denotes a reading command while a high level of changeover signal A/S causes a sign inversion.
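The arithmetic carried out for one excitation pulse may be summarized by the following sketch, which is offered under stated assumptions rather than as a transcription of FIG. 4. The function name `lattice_step` and the list layout are hypothetical. The recursion assumed here is E₀ = G·x, E_i = E_(i-1) - K_i·F_i, new F_(i-1) = F_i + K_i·E_i for i ≥ 2, and F₁₀ = E₁₀; this indexing is inferred from the description of the tenth cell and corresponds to a conventional all-pole lattice.

```python
def lattice_step(x, gain, k, f):
    """Process one excitation sample through the lattice.

    x    : excitation sample
    gain : gain factor G
    k    : reflection coefficients, k[0] is K1 ... k[-1] is K10
    f    : stored sums F1..F10 from the previous sample of this channel
           (plays the role of memory ME5; updated in place)
    Returns the speech sample (difference E10).
    """
    n = len(k)
    e = gain * x                          # E0 = G * excitation
    for i in range(1, n + 1):
        fi = f[i - 1]                     # F_i as left by the previous subcycle
        e = e - k[i - 1] * fi             # E_i = E_(i-1) - K_i * F_i
        if i >= 2:
            f[i - 2] = fi + k[i - 1] * e  # new F_(i-1) = F_i + K_i * E_i
    f[n - 1] = e                          # E10 is kept to serve as F10
    return e


# Hypothetical usage: a single-cell filter excited by a unit impulse.
state = [0.0]
samples = [lattice_step(x, 1.0, [0.5], state) for x in (1.0, 0.0, 0.0)]
```

In this sketch each F_i is read before the following cell overwrites it, so a single array suffices for the delayed values.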

Let us assume that, at an instant v₁, a channel-selection signal CK_(i) (cf. FIG. 3) coincides with a clock pulse CK₁ and a high level of enabling signal TR₁, resulting in the emission of an excitation pulse from generator memory EP (FIG. 1) to input register RE₃ and the loading of a gain factor G into register RE₄. During an accommodation interval of at least 100 nsec, which follows instant v₁, enabling signals TR₂, TR₃ have a low logic level, thereby preventing the reading of algebraic values from register RE₅ or memory ME₅ to input register RE₃. At an instant v₂, these signals TR₂, TR₃ take on a high logic level, thereby allowing memory ME₅ to feed back to that input register the coded sum F₁ (calculated in the preceding subcycle assigned to the selected channel) and commanding output register RE₆ to transmit the product E₀ from multiplier ML₃ onto lead 42. Upon the generation of clock pulses CK₁, CK₂ at an instant v₃, registers RE₃, RE₄ load sum F₁ and reflection coefficient K₁ from memory ME₅ and input module IN_(i), respectively; register RE₆ memorizes the product E₀ present at the output of multiplier ML₃, this product being transferred to register RE₅ in response to a clock pulse CK₃ at an instant v₄. At the same instant the logic level of signal TR₃ goes low, thereby disconnecting memory ME₅ from output lead 41'.

An increase of the voltage of signal TR₂ at an instant v₅ enables the transfer of product E₀ from register RE₅ to adder SM₃ via lead 41'. The next clock pulse CK₂, following after a 100-nsec delay, causes the loading of product π_(1a) into register RE₆. Because this register is already enabled by signal TR₄ and because logic network LN₂ is receiving a high-level signal A/S, product π_(1a) is transmitted to adder SM₃ for subtraction from product E₀, the resulting difference E₁ being temporarily stored in register RE₇ in response to a clock pulse CK₄ at an instant v₇. Simultaneously with the rising edge of this pulse, the logic levels of signals TR₂, TR₄ fall and the logic levels of signals TR₃, TR₅ rise, whereby registers RE₅, RE₆ are prevented from emitting signals onto leads 41', 42 whereas memory ME₅ and register RE₇ are enabled to feed back the coded algebraic values F₂, E₁ to registers RE₃, RE₅, respectively. At a subsequent instant v₈, clock pulses CK₁ and CK₃ induce the transfer of difference E₁ to register RE₅ and the loading of sum F₂ and of coefficient K₂ into registers RE₃ and RE₄ for transmission to multiplier ML₃ to form the product π_(2a). Upon the reading of sum F₂ to register RE₃ and the emission of difference E₁ from register RE₇, signals TR₃, TR₅ assume a low level (instant v₉) to disconnect units ME₅ and RE₇ from output leads 41' and 42. Signals TR₂ and TR₄ then resume, at an instant v₁₀, their high levels for enabling the transmission of difference E₁ to adder SM₃ and of product π_(2a) from multiplier ML₃ via register RE₆ and logic network LN₂ to adder SM₃. Because signal A/S has a high level between instants v₁₁ and v₁₂, the algebraic sign of product π_(2a) is inverted by logic network LN₂ and the result loaded at instant v₁₂ into register RE₇ is a difference E₂. The feeding of product π_(2a) to output register RE₆ is commanded by a clock pulse CK₂ at instant v₁₁, this instant terminating a first processing phase symbolized by the first filter cell TV₁ of FIG. 4.

Enabling signals TR₄, TR₅ go low and high, respectively, at instant v₁₂, thereby inhibiting further transmission from register RE₆ but allowing register RE₇ to generate on lead 42 a pulse code representing the value of difference E₂. An ensuing clock pulse CK₃ (at an instant v₁₃) loads the value of this difference into register RE₅. Owing to the high logic level of enabling signal TR₂, register RE₅ transfers difference E₂ to unit RE₃ upon the appearance of a clock pulse CK₁ at an instant v₁₄. This clock pulse also causes the loading of reflection coefficient K₂ into register RE₄. During an ensuing interval v₁₄-v₁₇, multiplier ML₃ forms product π_(2b). The common output lead 41' is disconnected from register RE₅ and connected to memory ME₅ in response to the changing levels of signals TR₂ and TR₃ at an instant v₁₅ whereby sum F₃ is fed back to register RE₃.

At an instant v₁₆, signals A/S and TR₄ assume low and high logic levels, respectively, thereby enabling the transfer of product π_(2b) without sign change from multiplier ML₃ to adder SM₃ upon the generation of clock pulse CK₂ at instant v₁₇. At the same instant a clock pulse CK₁ loads register RE₃ with sum F₃ (calculated during the processing of the preceding excitation pulse assigned to the output channel here considered) and register RE₄ with coefficient K₃, the product π_(3a) formed from sum F₃ and coefficient K₃ being stored in register RE₆ at an instant v₂₁. Clock pulse CK₄ at an instant v₁₈ induces the temporary memorization by register RE₇ of the newly formed sum F₁. The passing, at instant v₁₈, of signal TR₅ to a high logic level enables the transmission of the new sum F₁ from register RE₇ to memory ME₅ upon the appearance, at an instant v₁₉, of a writing command in the form of a low level of signal R/W. The enabling of register RE₇ by signal TR₅ coincides with the return of changeover signal A/S to a high level, switching adder SM₃ to its subtractive mode, and the return of enabling signal TR₄ to a low level.

The subsequent processing phases of filter TV, corresponding to intermediate cells TV₃ to TV₉ omitted in FIG. 4 but indicated in FIG. 6, are the same as the operations symbolized by cell TV₂ occurring between instants v₁₁ and v₂₁ as described above. At an instant v₂₂, marking the beginning of a final calculation phase symbolized by the tenth cell TV₁₀, a clock pulse CK₂ loads product π_(10a) into register RE₆. Owing to the high levels of changeover and enabling signals A/S and TR₄, the sign of the product is inverted in logic network LN₂ upon transmission thereto by register RE₆. Adder SM₃ subtracts the product π_(10a) from the difference E₉ (temporarily stored in register RE₅) to produce the difference E₁₀. At an instant v₂₃, signals CK₄, TR₄ and TR₅ assume high, low and high logic levels, respectively, whereby register RE₇ receives difference E₁₀ and is enabled to transfer it to register RE₅ upon the appearance of a clock pulse CK₃ at an instant v₂₄. At a subsequent time v₂₅ a clock pulse CK₅ enables the transfer of difference E₁₀ to converter MU (see FIG. 1) and to buffer memory ME₆ while a clock pulse CK₁ loads registers RE₃ and RE₄ with difference E₁₀ and coefficient K₁₀, respectively, to be fed to multiplier ML₃ for the implementation of product π_(10b). The altering of the voltage levels of signals TR₂, TR₃ at a time v₂₆ blocks any emission from register RE₅ over lead 41' and enables the transfer of sum F₁₀ (from the previous processing subcycle) to adder SM₃.

With changeover signal A/S going low and enabling signal TR₄ going high at an instant v₂₇, the appearance of a clock pulse CK₂ at an instant v₂₈ causes product π_(10b) to be transmitted without change in sign to adder SM₃ for combination with sum F₁₀ to form a new sum F₉ which is then stored in register RE₇ in response to a pulse CK₄ at an instant v₂₉. At the latter instant the levels of signals TR₂ and TR₅ go high and the levels of signals TR₃ and TR₄ go low, whereby the new sum F₉ is loaded into register RE₅. Because signal TR₂ is high, a writing pulse at a time v₃₀ enables the transfer of sum F₉ to memory ME₅. A subsequent writing pulse (instant v₃₂), occurring after the appearance of a clock pulse CK₆ enabling the connection of memory ME₆ to output lead 42, causes the storage in memory ME₅ of difference E₁₀, which will serve as sum F₁₀ in the next processing subcycle assigned to the output channel here considered. The current subcycle terminates upon the return of signal CK_(i) to a low logic level at a time v₃₃. The next subcycle begins at this time v₃₃ and is assigned to another output channel identified by the immediately following selection pulse CK_(a)-CK_(n).
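The succession of subcycles assigned to different output channels may be pictured by the following sketch, given under the same assumptions as to the recursion (E_i = E_(i-1) - K_i·F_i, new F_(i-1) = F_i + K_i·E_i, F₁₀ = E₁₀). The names `subcycle` and `sampling_period` and the dictionary layout of a channel are hypothetical. The point illustrated is that a single set of arithmetic units serves all channels in turn, whereas each channel keeps its own stored sums and its own parameters.

```python
def subcycle(x, gain, k, f):
    """One subcycle of the shared filter for the currently selected channel."""
    e = gain * x                       # E0
    for i, ki in enumerate(k):
        fi = f[i]                      # sum left by this channel's last subcycle
        e -= ki * fi                   # next difference
        if i > 0:
            f[i - 1] = fi + ki * e     # new sum for the preceding cell
    f[-1] = e                          # last difference is fed back as last sum
    return e


def sampling_period(channels, excitations):
    """Serve every channel once, in order, each with its own state."""
    return [subcycle(x, ch["gain"], ch["k"], ch["f"])
            for ch, x in zip(channels, excitations)]


# Hypothetical usage: two channels with different parameters.
channels = [
    {"gain": 1.0, "k": [0.5], "f": [0.0]},
    {"gain": 2.0, "k": [0.0], "f": [0.0]},
]
first = sampling_period(channels, [1.0, 1.0])
second = sampling_period(channels, [0.0, 0.0])
```

Since the stored sums are held per channel, the processing of one channel does not disturb the values fed back to another.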

We claim:
 1. A digital speech synthesizer comprising: pulse-generating means for emitting excitation pulses of varying amplitudes and polarities; a lattice filter operatively connected to said pulse-generating means for producing digital speech samples in response to said excitation pulses; a digit-to-analog converter at an output of said filter for translating said samples into voice signals; a programmed source of stored sets of processing parameters transmittable, in a predetermined sequence of sets, to said pulse-generating means for commanding the emission of said excitation pulses and to said filter for controlling the processing of said excitation pulses thereby, said parameters encoding information relating to frequency distribution, volume and duration of speech elements; input means operatively connected to said pulse-generating means, to said filter and to said source for facilitating the transmission of consecutive sets of said sequence from said source to said pulse-generating means and to said filter, thereby producing consecutive speech elements of a voice signal coded by said sequence, said input means including counting means for controlling the respective durations of said consecutive speech elements according to settings for said counting means transmitted together with said parameters from said source, said setting establishing different validity intervals for said sets; and timing means operatively connected to said input means, to said filter and to said pulse-generating means for correlating the operations thereof.
 2. A synthesizer as defined in claim 1 wherein said pulse-generating means includes a first generator adapted to emit digitized amplitude samples of alternating waveforms to produce voiced speech elements and a second generator adapted to emit constant-amplitude pulses free from recognizable periodicity to produce unvoiced speech elements, said parameters including a discriminating signal for selectively enabling either one of said generators.
 3. A synthesizer as defined in claim 2 wherein said input means includes a plurality of input units associated with respective output channels, said timing means being connected to said input units for individually activating same one at a time, said timing means controlling said pulse-generating means and said filter in a time-division mode.
 4. A synthesizer as defined in claim 3, further comprising a control unit forming an interface between said source and said input units for temporarily storing parameter-set requests therefrom and for distributing parameter sets from said source to respective input units selected according to address information supplied by said source.
 5. A synthesizer as defined in claim 3 or 4 wherein each of said input units further includes a pair of buffer memories for temporarily and alternately storing successive parameter sets from said source, said counting means being connected to said memories for enabling an interchange of reading and writing functions therebetween upon detecting the termination of a current validity interval.
 6. A synthesizer as defined in claim 5 wherein said counting means includes a validity-interval counter and further includes a sound-interval counter for determining the end of voiced intervals and of unvoiced intervals; said input means further comprising a switch operating in response to said discriminating signal, stored in either of said buffer memories, to control the loading of said sound-interval counter with unvoiced-interval settings corresponding to the contents of said validity-interval counter and with pitch-period settings stored in either of said buffer memories representing frequency characteristics of voiced speech elements, and an additional memory for temporarily storing filter coefficients and sound-intensity data transmitted from said buffer memories in response to a reading signal generated by said sound-interval counter upon detecting the termination of a current sound interval, said additional memory being responsive to clock pulses from said timing means for transmitting said coefficients to said filter.
 7. A synthesizer as defined in claim 4 wherein said control unit includes a logic network for enabling the transfer of a parameter request from an input unit to said source only upon receiving therefrom consent signals indicating completion of an ongoing transmission of a parameter-set sequence to such input unit.
 8. A synthesizer as defined in claim 7 wherein said control unit further includes register means for temporarily storing parameters for said source and a series-to-parallel converter for decoding address signals received from said source to enable the transmission of parameters from said register means to a selected input unit.
 9. A synthesizer as defined in claim 7 wherein said control unit further includes a parallel-to-series converter for encoding addresses of request-emitting input units and a read/write memory at the output of said parallel-to-series converter for temporarily storing said addresses prior to emission thereof to said source in response to a ready signal therefrom.
 10. A synthesizer as defined in claims 1, 2, 3, 4, 7, 8 or 9 wherein said filter includes a digital multiplier, a digital adder and storage means for generating a digital speech sample as a sum of terms including an excitation sample weighted by a sound-intensity coefficient and at least one term formed as a product between a reflection coefficient and a preceding digital speech sample.